Synopsis: Spaghetti Query

Let’s explore how making a single query for retrieving results in one go can create issues.

Let’s imagine the following scenario. Your boss is on the phone with their boss, and your boss waves to you to come over. They cover their phone receiver with their hand and whisper to you, “The executives are in a budget meeting, and we’re going to have our staff cut unless we can feed my VP some statistics to prove that our department keeps a lot of people busy. I need to know how many products we work on, how many developers fixed bugs, the average bugs fixed per developer, and how many of our fixed bugs were reported by customers. Right now!”

You leap to your SQL tool and start writing. You want all the answers at once, so you make one complex query, hoping to do the least duplicate work and, therefore, produce faster results.

Retrieving selective data for tracking bugs information

The numbers come back, but they seem wrong. You wonder: “How did we get dozens of products? How can the average fixed bugs be exactly 1.0? And was it the number of customers or the number of bugs reported by customers that the boss wanted? How can all the numbers be so far off?” Finally you realize: “This query will be a lot more complex than I thought.”

Your boss hangs up the phone. “Never mind,” they sigh. “It’s too late. Let’s clean out our desks.”

Objective: Decrease SQL queries#

One of the most common places where SQL programmers get stuck is when they ask, “How can I do this with a single query?”

This question is asked for virtually any task. Programmers have been trained to think that one SQL query is difficult, complex, and expensive enough, so two SQL queries are bound to be twice as bad. And any number of queries beyond that is simply out of the question.

Programmers can’t reduce the complexity of their tasks, but they want to simplify the solution. They state their goal with terms like “elegant” or “efficient,” and they think they’ve achieved those goals by solving the task with a single query.

Legitimate uses of the antipattern#

The most common reason we might need to run a complex task with a single query is that we are using a programming framework or a visual component library that connects to a data source and presents data in an application. Simple business intelligence and reporting tools also fall into this category, although more sophisticated BI software can merge results from multiple data sources.

A component or reporting tool that assumes that its data source is a single SQL query may have a simpler usage. Still, it encourages us to design monolithic queries to synthesize all the data for our report. If we use one of these reporting applications, we may be forced to write a more complex SQL query than if we had the opportunity to write code to process the result set.

If the reporting requirements are too complex to be satisfied by a single SQL query, it might be better to produce multiple reports. And if the boss doesn’t like this, we can remind them of the relationship between the report’s complexity and the hours it takes to produce it.

Sometimes, we may want to produce a complex result in one query because we need all the results combined in sorted order. It’s easy to specify a sort order in an SQL query. It’s likely to be more efficient for the database to do that than for us to write custom code in our application to sort the results of several queries.

Alternatives To Make Text Searchable
Antipattern: Solve a Complex Problem in One Step
Mark as Completed
Report an Issue